Genome-wide localization of small molecules
Abstract
A vast number of small-molecule ligands, including therapeutic drugs under development and in clinical use, elicit their effects by binding specific proteins associated with the genome. An ability to map the direct interactions of a chemical entity with chromatin genome-wide could provide new and important insights into chemical perturbation of cellular function. Here we describe a method that couples ligand-affinity capture and massively parallel DNA sequencing (Chem-seq) to identify the sites bound by small chemical molecules throughout the human genome. We show how Chem-seq can be combined with ChIP-seq to gain unique insights into the interaction of drugs with their target proteins throughout the genome of tumor cells. These methods provide a powerful approach to enhance understanding of therapeutic action and characterize the specificity of chemical entities that interact with DNA or genome-associated proteins.
NIH Public Access Author Manuscript. Nat Biotechnol. Author manuscript; available in PMC 2014 October 08. Published in final edited form as: Nat Biotechnol. 2014 January; 32(1): 92–96. doi:10.1038/nbt.2776.
Footnote 6: These authors contributed equally to this work.
Accession Codes: Chem-seq and ChIP-seq data, GEO: GSE44098 and GSE43743, respectively.
Author Contributions: J.J.M., P.B.R., J.E.B. and R.A.Y. conceived of the Chem-seq method, L.A. and M.G.G. developed the method, L.A. generated bio-JQ1 and bio-psoralen Chem-seq data, M.G.G. generated bio-AT7519 Chem-seq data, Z.P.F. developed computational methods and analyzed the data, J.Q. and J.J.M. synthesized biotinylated derivatives of chemical probes, J.Q. performed protein biochemistry, L.A., P.B.R. and J.L. generated ChIP-seq data for BRD2, BRD3, BRD4, RNA pol II, CDK7, CDK8 and CDK9, W.B.S. generated cellular proliferation data, A.A.S. contributed to optimizing Chem-seq, T.I.L. provided advice on method development, and J.E.B. and R.A.Y. supervised the research.
Competing Financial Interests: J.E.B. and R.A.Y. are founders of Syros Pharmaceuticals. J.J.M., P.B.R., M.G.G. and J.L. are employees of Syros Pharmaceuticals.
The ability to map the locations of proteins throughout the genome has had a profound impact on our understanding of a wide range of normal and disease biology. For example, discovery of the genome-wide location of proteins using ChIP-seq has allowed global mapping of the key transcription factors and chromatin regulators that control gene expression programs in various cells, the sites that act as origins of DNA replication, and regions of the genome that form euchromatin and heterochromatin (refs. 1–6). Models of the transcriptional regulatory circuitry that controls normal and disease cell states have emerged from genome-wide data (refs. 7–10). An ability to map the global interactions of a chemical entity with chromatin genome-wide could provide new insights into the mechanisms by which a small molecule influences cellular functions.
Many DNA-associated processes are targeted for disease therapy, including transcription, modification, replication and repair (refs. 11–16). Ligand-affinity methodologies have greatly contributed to our understanding of drug and ligand function at the genome, and have led to the identification of numerous gene regulatory drug targets (refs. 17–20). There have been initial efforts to map the sites of interaction of metabolic compounds in the yeast genome (ref. 21), but it would be ideal to have a method that allows investigators to determine how small-molecule therapeutics interact with the human genome. We describe here a method based on chemical affinity capture and massively parallel DNA sequencing (Chem-seq) that allows investigators to identify genomic sites where small chemical molecules interact with their target proteins or DNA (Fig. 1a).
The Chem-seq method is similar to that employed for ChIP-seq, except that Chem-seq uses retrievable synthetic derivatives of a compound of interest to identify sites of genome occupancy, whereas ChIP-seq uses antibodies against specific proteins for this purpose.
We used Chem-seq to investigate the genome-wide binding of the bromodomain inhibitor JQ1 to the BET bromodomain family members BRD2, BRD3 and BRD4 in MM1.S multiple myeloma cells. JQ1 has previously been shown to bind all three co-activator proteins and to inhibit growth of MM1.S and other tumor cells (refs. 13, 22–27).
We first investigated how BRD2, BRD3 and BRD4 occupy the genome of MM1.S cells using ChIP-seq (Supplementary Fig. 1). All three proteins were found to be associated with actively transcribed genes (Supplementary Fig. 1a). Inspection of individual gene tracks (Supplementary Fig. 1b) and analysis of global genome occupancy (Supplementary Fig. 1c) showed that most core promoter elements of active genes were co-occupied by BRD2, BRD3 and BRD4 together with RNA polymerase II, the Mediator coactivator and histone H3K27Ac. In contrast, enhancers, which are occupied by histone H3K27Ac and Mediator, were preferentially occupied by BRD4, with lower relative levels of BRD2 and BRD3.
To investigate the interaction of JQ1 with chromatin genome-wide, we used the Chem-seq technique (Fig. 1a) with a biotinylated derivative of JQ1 (bio-JQ1, Fig. 1b). Enantioretentive substitution at C-6 of the JQ1 diazepine allowed coupling of a polyethylene glycol spacer with an appended biotin feature. The potency of bio-JQ1 binding to the first bromodomain of BRD4 was nearly equivalent to that of the unbiotinylated compound, as determined by both differential scanning fluorimetry and isothermal titration calorimetry (Supplementary Fig. 2). Consistent with this, bio-JQ1 had only slightly reduced bioactivity in MM1.S cells relative to JQ1 (Fig. 1c).
We initially treated living cells with bio-JQ1 and cross-linked proteins to DNA with formaldehyde (in vivo Chem-seq, Fig. 1a, upper panel). Cells were then lysed and sonicated to shear the DNA, and streptavidin beads were used to isolate the biotinylated ligand and associated chromatin fragments. Massively parallel sequencing was used to identify enriched DNA fragments, and these sequences were mapped to the genome to reveal sites bound by the small-molecule probe.
In addition, we developed an in vitro version of this method, which allows analysis of biotinylated molecules with potentially limited cell permeability (in vitro Chem-seq, Fig. 1a, lower panel). To this end, MM1.S cells were fixed and the derived sonicated lysate was incubated with biotinylated JQ1 to enrich for bound chromatin regions in vitro.
We found that both in vivo and in vitro Chem-seq produced essentially the same result: the genomic sites bound by biotinylated JQ1 are highly similar to the sites occupied by BRD2, BRD3 and BRD4 (Fig. 1d, e). This was further confirmed by inspection of data at individual genes with pivotal roles in myeloma biology, such as CCND2 (Fig. 1f). By contrast, a functionally inactive enantiomer of bio-JQ1 (bio-JQ1R, Supplementary Fig. 3a) did not produce significant Chem-seq signals (Supplementary Fig. 3b, c). These results indicate that both live-cell and cell-lysate-based Chem-seq approaches (Fig. 1a) can be used to uncover the interactions of small molecules with their chromatin targets across the human genome. Of note, JQ1 is known to displace BET bromodomains from the genome, but the ability to detect the bio-JQ1/BRD complex on chromatin is likely made possible by covalent tethering of these proteins to chromatin during fixation (Supplementary Fig. 4).
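The similarity between bio-JQ1-bound sites and BET protein sites described above can be illustrated with a simple peak-overlap calculation. The following is a minimal sketch, not the authors' analysis pipeline: it assumes peaks have already been called and are given as (chromosome, start, end) tuples with half-open coordinates, and the function names and toy coordinates are hypothetical.

```python
def overlaps(a, b):
    """Return True if half-open intervals a=(start, end) and b share a base."""
    return a[0] < b[1] and b[0] < a[1]


def fraction_co_occupied(chem_peaks, chip_peaks):
    """Fraction of Chem-seq peaks that overlap at least one ChIP-seq peak.

    Both arguments are lists of (chrom, start, end) tuples.
    """
    # Group ChIP-seq peaks by chromosome so each comparison stays local.
    by_chrom = {}
    for chrom, start, end in chip_peaks:
        by_chrom.setdefault(chrom, []).append((start, end))

    hits = 0
    for chrom, start, end in chem_peaks:
        if any(overlaps((start, end), peak) for peak in by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(chem_peaks) if chem_peaks else 0.0


# Hypothetical toy peaks, for illustration only (not data from the study).
chem = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150), ("chr3", 10, 20)]
chip = [("chr1", 150, 250), ("chr2", 0, 60), ("chr1", 900, 1000)]
print(fraction_co_occupied(chem, chip))
```

The linear scan per chromosome is chosen for readability; for genome-scale peak sets a sorted-interval or interval-tree approach would be more appropriate.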
We next investigated the extent to which Chem-seq and ChIP-seq signals overlap (Fig. 2). The pattern of JQ1 occupancy was best associated with the pattern of BRD4 occupancy (Fig. 2a). Pearson correlation analysis also showed that bio-JQ1 signals were most highly correlated with BRD4, somewhat less with BRD2 and much less with BRD3 (Fig. 2b). We then developed a generalized linear model (GLM) to identify genomic regions with differential signal between bio-JQ1 Chem-seq and each of the BRD ChIP-seq datasets. We found that bio-JQ1 co-occupied nearly all regions (>99%) with BRD4 genome-wide across triplicate datasets, bio-JQ1 and BRD2 co-occupied 96% of all genomic sites, and bio-JQ1 and BRD3 co-occupied 63% of all genomic sites (Fig. 2c). Inspection of gene tracks for regions differentially occupied by bio-JQ1 and the three BET proteins provided visual confirmation that bio-JQ1 tends to co-occupy enhancers where there are substantial BRD4 signals and lower signals for BRD2 and BRD3 (Fig. 2d). The pattern of BRD3 genome occupancy differed most from that of the other two BET proteins (Fig. 2a–c), and this was due to pronounced signals at a subset of core promoter sites (Fig. 2e). Similar results were obtained with an alternative BRD3 ChIP antibody directed against a different epitope of this protein (Supplementary Fig. 5). Taken together, these results indicate that the pattern of JQ1 occupancy of chromatin is most correlated with that of BRD4 in MM1.S cells, consistent with the relative affinities of JQ1 for these BET proteins previously established in vitro (ref. 27).
To extend the Chem-seq method to other drug classes, we initially focused on AT7519, an inhibitor of the cyclin-dependent kinase CDK9 (ref. 28), which is associated with the transcription apparatus at promoters. CDK9 phosphorylation of RNA polymerase II and various pause control factors stimulates active elongation (ref. 29).
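The Pearson correlation comparison of bio-JQ1 Chem-seq signal with BRD2, BRD3 and BRD4 ChIP-seq signal reported above can be sketched as follows. This is an illustrative example under stated assumptions, not the computational method used in the study: it assumes each dataset has been summarized as one read-density value per genomic region, in the same region order, and the numbers in `signals` are invented toy values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-region read densities (toy values, not study data).
signals = {
    "bio-JQ1": [10, 2, 8, 1, 6],
    "BRD4": [20, 5, 15, 3, 13],
    "BRD3": [4, 9, 3, 8, 12],
}

# Correlate the Chem-seq signal with each ChIP-seq signal and pick the best.
correlations = {
    name: pearson(signals["bio-JQ1"], values)
    for name, values in signals.items()
    if name != "bio-JQ1"
}
best_match = max(correlations, key=correlations.get)
print(correlations, best_match)
```

With real data, the per-region densities would typically be normalized for sequencing depth before correlation, a step this sketch leaves out.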
We first confirmed that CDK9 co-occupies the promoters of active genes with RNA polymerase II by using ChIP-seq (Fig. 3a, b). CDK9 is a core component of the positive transcription elongation factor, p-TEFb (ref. 29), and its inhibition would be expected to affect the levels of elongating RNA polymerase II, which is located across the body of genes, to a much greater extent than the levels of initiating RNA polymerase located at the transcription start site. Indeed, treatment of MM1.S cells with AT7519 was found to cause a reduction in the level of elongating RNA polymerase II based on examination of individual gene tracks (Fig. 3c) and on analysis of the ratio of initiating versus elongating RNA polymerase II molecules at active genes throughout the genome (Fig. 3d).
We next generated a retrievable biotinylated derivative of AT7519 (Fig. 3e). The biotinylated compound was found to have reduced ability to enter cells (Fig. 3f, g), so we used the in vitro Chem-seq method to investigate binding of bio-AT7519 to chromatin genome-wide. The results show that bio-AT7519 Chem-seq signals occur frequently at sites occupied by CDK9 (Fig. 3h, i). The bio-AT7519 Chem-seq signals were weaker than those observed for bio-JQ1, which may reflect differences in accessibility, association constants, or ligand-receptor sensitivity to sample preparation. Nonetheless, there was a correlation between bio-AT7519 occupancy and CDK9 occupancy genome-wide (Supplementary Fig. 6a). There were also a substantial number of sites that were not co-occupied by bio-AT7519 and CDK9; it is possible that this is due to the relatively weak signals we obtained for bio-AT7519 Chem-seq or to the fact that AT7519 can inhibit other kinases (refs. 28, 30) that may occupy other genomic sites (Supplementary Fig. 6b).
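The ratio of initiating versus elongating RNA polymerase II used above to assess the effect of AT7519 can be approximated as promoter-proximal signal density divided by gene-body signal density. The sketch below is a hedged illustration: the 300 bp promoter window, the function names and the toy coverage profiles are assumptions made for this example, not parameters taken from the study.

```python
def density(coverage, start, end):
    """Mean per-base signal over the half-open interval [start, end)."""
    region = coverage[start:end]
    if not region:
        raise ValueError("empty region")
    return sum(region) / len(region)


def initiating_to_elongating_ratio(coverage, tss, gene_end, promoter_window=300):
    """Promoter-proximal Pol II density divided by gene-body density.

    coverage is a per-base signal list for one gene locus; promoter_window
    (an assumed value) sets how far around the TSS counts as 'initiating'.
    """
    promoter = density(coverage, max(0, tss - promoter_window), tss + promoter_window)
    body = density(coverage, tss + promoter_window, gene_end)
    if body == 0:
        return float("inf")
    return promoter / body


# Hypothetical coverage: same promoter signal, less gene-body signal after treatment.
untreated = [10.0] * 600 + [5.0] * 1400
treated = [10.0] * 600 + [1.0] * 1400

ratio_untreated = initiating_to_elongating_ratio(untreated, tss=300, gene_end=2000)
ratio_treated = initiating_to_elongating_ratio(treated, tss=300, gene_end=2000)
print(ratio_untreated, ratio_treated)
```

In this toy case the ratio rises after treatment because gene-body signal drops while promoter signal is unchanged, which is the direction expected when elongating polymerase is reduced.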
Notably, a comparison of the Chem-seq data for bio-AT7519 and bio-JQ1 with ChIP-seq data for various components of the transcription apparatus (CDK7, CDK8, CDK9, RNA polymerase II, Mediator and BRD4) revealed that bio-AT7519 was most associated with CDK9, whereas bio-JQ1 was most associated with BRD4 (Supplementary Fig. 6c). These results suggest that Chem-seq can be useful for identifying the genomic binding sites of kinase inhibitors.
To further extend the Chem-seq method to other drug classes, we investigated how the DNA intercalator psoralen interacts with genomic DNA in vivo. Recent studies have shown that psoralen preferentially intercalates at the transcriptional start sites (TSS) of active genes (refs. 31, 32). We used the in vivo Chem-seq method with biotinylated psoralen (bio-psoralen) (Fig. 3j) to explore this observation genome-wide in MM1.S cells. The results confirm that bio-psoralen preferentially binds to the TSS of active genes (Fig. 3k–m). Thus, Chem-seq can detect local enrichment of DNA intercalating agents throughout the human genome.
A broad range of drugs should generally be amenable to biotinylation and Chem-seq analysis. The design and synthesis of biotinylated probes can be informed by structural data from drug-target complexes; such X-ray structures allow the identification of suitable attachment positions that can be covalently linked to the biotin moiety and that remain freely accessible in the complex. Suitable attachment points could also be inferred from structure-activity relationship data derived from structurally related compounds. If such data are not available, several attachment sites can be selected for biotinylation, and the derived probes can be tested experimentally for their ability to retain binding to the target.
Points of attachment can either be provided by functional groups already present in the drug molecule, or may be obtained through chemical modification of the compound structure, such as alkylation or addition of amide or ester linkages. Finally, as there is an expanding interest in elucidating the mechanisms of action of many drugs, biotinylated versions of such compounds are increasingly becoming commercially available.
In summary, Chem-seq provides a method to identify the sites bound by small chemical molecules throughout the human genome. When combined with other global analysis methods such as ChIP-seq, Chem-seq provides a powerful approach to investigate the direct, genome-wide effects of therapeutic modalities. This ability to map the global interactions of a chemical entity with chromatin genome-wide should provide new insights into the mechanisms by which small molecules perturb gene expression programs.